Syntax-Directed Translation
Recursive Descent Parsing
Constructing a parse tree



Outline 



Left Recursion 



Recursive descent 

• Algorithms 

• Limitations 
Abstract Syntax Trees

So far, a parser traces the derivation of a sequence of tokens

The rest of the compiler needs a structural representation of the program: an actual data structure that tells what the operations in the program are and how they are put together

Abstract syntax trees 

• Like parse trees but ignore some details 

• Abbreviated as AST 
Abstract Syntax Tree (Cont.)

• Consider the grammar

E → int | ( E ) | E + E

• And the string 

5 + (2 + 3) 



After lexical analysis (a list of tokens) 

int 5 '+' '(' int 2 '+' int 3 ')'



During parsing we build a parse tree ... 
Example of Parse Tree

[Figure: parse tree for 5 + (2 + 3)]

Traces the operation of the parser

Does capture the nesting structure

But too much info
• Parentheses
• Single-successor nodes
Example of Abstract Syntax Tree

An AST represents the same thing as the parse tree, but it compresses out all the junk in the parse tree, so it is much simpler

AST also captures the nesting structure

But abstracts from the concrete syntax
=> more compact and easier to use

An important data structure in a compiler 
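
To make the data structure concrete, here is a minimal sketch of an AST for the grammar E → int | ( E ) | E + E. The type and function names (Ast, ast_int, ast_plus, example_ast, ast_eval, ast_count, ast_free) are illustrative assumptions, not part of the lecture. Parentheses get no node of their own, because the shape of the tree already records the nesting.

```c
#include <stdlib.h>

/* Node kinds for E -> int | ( E ) | E + E.
   There is no node kind for parentheses: nesting is encoded by tree shape. */
typedef enum { AST_INT, AST_PLUS } AstKind;

typedef struct Ast {
    AstKind kind;
    int value;                 /* used when kind == AST_INT  */
    struct Ast *left, *right;  /* used when kind == AST_PLUS */
} Ast;

static Ast *ast_new(AstKind kind, int value, Ast *left, Ast *right) {
    Ast *n = malloc(sizeof *n);
    if (n == NULL) abort();    /* sketch only: no error recovery */
    n->kind = kind;
    n->value = value;
    n->left = left;
    n->right = right;
    return n;
}

Ast *ast_int(int value) { return ast_new(AST_INT, value, NULL, NULL); }
Ast *ast_plus(Ast *l, Ast *r) { return ast_new(AST_PLUS, 0, l, r); }

/* AST for 5 + (2 + 3), i.e. PLUS(5, PLUS(2, 3)). */
Ast *example_ast(void) {
    return ast_plus(ast_int(5), ast_plus(ast_int(2), ast_int(3)));
}

/* A later compiler phase can walk the tree without knowing the concrete syntax. */
int ast_eval(const Ast *n) {
    if (n->kind == AST_INT) return n->value;
    return ast_eval(n->left) + ast_eval(n->right);
}

/* Number of nodes in the tree. */
int ast_count(const Ast *n) {
    if (n == NULL) return 0;
    return 1 + ast_count(n->left) + ast_count(n->right);
}

void ast_free(Ast *n) {
    if (n == NULL) return;
    ast_free(n->left);
    ast_free(n->right);
    free(n);
}
```

The tree built by example_ast has five nodes (two PLUS nodes and three integer leaves), while the parse tree of the same string also carries the parentheses and the single-successor E nodes.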
Summary

We can specify language syntax using a CFG

• A parser will answer whether s ∈ L(G)

... and will build a parse tree

... which will be converted to an AST

... and passed on to the rest of the compiler
Intro to Top-Down Parsing: The Idea

The parse tree is constructed

• From the top

• From left to right

Terminals are seen in order of appearance in the token stream:

t2 t5 t6 t8 t9
Recursive Descent Parsing

• Consider the grammar
E → T | T + E
T → int | int * T | ( E )

Token stream is: ( int 5 ) 

Start with top-level non-terminal E 
Try the rules for E in order 
E → T | T + E
T → int | int * T | ( E )

Recursive Descent Parsing

Input: ( int 5 ), with ↑ marking the next token

• Start with E                      ↑( int 5 )
• Try E → T
• Try T → int
  Mismatch: int is not ( !
  Backtrack ...
• Try T → int * T
  Mismatch: int is not ( !
  Backtrack ...
• Try T → ( E )
  Match! Advance input              ( ↑int 5 )
• For the inner E, try E → T
• Try T → int
  Match! Advance input              ( int 5 ↑)
• Match ) in T → ( E )
  Match! Advance input
• End of input, accept

Dr. Sherin ElGokhy
Quiz

Choose the derivation that is a valid recursive descent parse for the string id + id in the given grammar. Moves that are followed by backtracking are given in red.

E → E' | E' + E
E' → -E' | id | (E)

[Answer options: candidate derivation sequences from E to id + id]
A Recursive Descent Parser. Preliminaries

• Let TOKEN be the type of tokens

Special tokens INT, OPEN, CLOSE, PLUS, TIMES

• Let the global next point to the next token
A (Limited) Recursive Descent Parser (2)

Define boolean functions that check the token string for a match of

• A given token terminal
  bool term(TOKEN tok) { return *next++ == tok; }
• The nth production of S:
  bool Sn() { ... }
• Try all productions of S:
  bool S() { ... }
A (Limited) Recursive Descent Parser (3)

• For production E → T

bool E1() { return T(); }

• For production E → T + E

bool E2() { return T() && term(PLUS) && E(); }

• For all productions of E (with backtracking)

bool E() {
  TOKEN *save = next;
  return (next = save, E1())
      || (next = save, E2()); }
A (Limited) Recursive Descent Parser (4)

• Functions for non-terminal T

bool T1() { return term(INT); }
bool T2() { return term(INT) && term(TIMES) && T(); }
bool T3() { return term(OPEN) && E() && term(CLOSE); }

bool T() {
  TOKEN *save = next;
  return (next = save, T1())
      || (next = save, T2())
      || (next = save, T3()); }
Example

E → T | T + E
T → int | int * T | ( E )

Input: ( int )

bool term(TOKEN tok) { return *next++ == tok; }
bool E1() { return T(); }
bool E2() { return T() && term(PLUS) && E(); }
bool E() { TOKEN *save = next; return (next = save, E1())
                                   || (next = save, E2()); }

bool T1() { return term(INT); }
bool T2() { return term(INT) && term(TIMES) && T(); }
bool T3() { return term(OPEN) && E() && term(CLOSE); }
bool T() { TOKEN *save = next; return (next = save, T1())
                                   || (next = save, T2())
                                   || (next = save, T3()); }
Quiz

Which lines are incorrect in the recursive descent implementation of this grammar?

E → E' | E' + id
E' → -E' | id | (E)

□ Line 3
□ Line 5
□ Line 6
□ Line 12

1  bool term(TOKEN tok) { return *next++ == tok; }
2  bool E1() { return E'(); }
3  bool E2() { return E'() && term(PLUS) && term(ID); }
4  bool E() {
5    TOKEN *save = next;
6    return (next = save, E1()) && (next = save, E2());
7  }
8  bool E'1() { return term(MINUS) && E'(); }
9  bool E'2() { return term(ID); }
10 bool E'3() { return term(OPEN) && E() && term(CLOSE); }
11 bool E'() {
12   TOKEN *next = save; return (next = save, T1())
13                           || (next = save, T2())
14                           || (next = save, T3());
15 }
Recursive Descent Parsing. Notes.

To start the parser
• Initialize next to point to first token
• Invoke E()

Notice how this simulates the example parse 
Easy to implement by hand 
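
The fragments from the parser slides can be assembled into a complete program. What follows is a minimal sketch rather than the lecture's own code: the END sentinel token, the const qualifiers, the forward declarations, and the parse driver are assumptions added so that the fragments compile and the end of input can be checked.

```c
#include <stdbool.h>

/* Token kinds from the slides, plus END as an assumed end-of-input sentinel. */
typedef enum { INT, OPEN, CLOSE, PLUS, TIMES, END } TOKEN;

static const TOKEN *next;   /* the global: points to the next token */

static bool E(void);
static bool T(void);

/* Compare the next token with tok and advance past it either way;
   callers restore next from their saved copy before trying an alternative. */
static bool term(TOKEN tok) { return *next++ == tok; }

/* E -> T | T + E */
static bool E1(void) { return T(); }
static bool E2(void) { return T() && term(PLUS) && E(); }
static bool E(void) {
    const TOKEN *save = next;
    return (next = save, E1())
        || (next = save, E2());
}

/* T -> int | int * T | ( E ) */
static bool T1(void) { return term(INT); }
static bool T2(void) { return term(INT) && term(TIMES) && T(); }
static bool T3(void) { return term(OPEN) && E() && term(CLOSE); }
static bool T(void) {
    const TOKEN *save = next;
    return (next = save, T1())
        || (next = save, T2())
        || (next = save, T3());
}

/* Driver: initialize next, invoke E(), then require that all input was used.
   The token array must end with END. */
bool parse(const TOKEN *tokens) {
    next = tokens;
    return E() && term(END);
}
```

Calling parse on the token array { OPEN, INT, CLOSE, END } follows the same sequence of attempts as the example parse: T1 and T2 fail on the opening parenthesis, T3 matches it, and the inner E succeeds through T1.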
When Recursive Descent Does Not Work
Limitations

Try examples
• int
• int * int

If a production for non-terminal X succeeds, recursive descent cannot backtrack to try a different production for X
When Recursive Descent Does Not Work
Limitations



Easy to implement by hand 
But not completely general 
Cannot backtrack once a production is successful 

Works for grammars where at most one production 
can succeed for a non-terminal 
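
A small experiment makes this limitation concrete. The sketch below keeps the production functions from the slides and adds a hypothetical longest_first switch (not in the lecture) that changes only the order in which alternatives are tried. The END sentinel and the parse_with_order driver are likewise assumptions.

```c
#include <stdbool.h>

typedef enum { INT, OPEN, CLOSE, PLUS, TIMES, END } TOKEN;  /* END: assumed sentinel */

static const TOKEN *next;
static bool longest_first;   /* hypothetical switch, not part of the slides */

static bool E(void);
static bool T(void);
static bool term(TOKEN tok) { return *next++ == tok; }

static bool E1(void) { return T(); }
static bool E2(void) { return T() && term(PLUS) && E(); }
static bool T1(void) { return term(INT); }
static bool T2(void) { return term(INT) && term(TIMES) && T(); }
static bool T3(void) { return term(OPEN) && E() && term(CLOSE); }

static bool E(void) {
    const TOKEN *save = next;
    if (longest_first)
        return (next = save, E2()) || (next = save, E1());
    return (next = save, E1()) || (next = save, E2());   /* order from the slides */
}

static bool T(void) {
    const TOKEN *save = next;
    if (longest_first)
        return (next = save, T2()) || (next = save, T1()) || (next = save, T3());
    return (next = save, T1()) || (next = save, T2()) || (next = save, T3());
}

/* Parse a token array that ends with END, using the chosen alternative order. */
bool parse_with_order(const TOKEN *tokens, bool try_longest_first) {
    longest_first = try_longest_first;
    next = tokens;
    return E() && term(END);
}
```

With the order from the slides, the input int * int is rejected: T1 succeeds on the first int, T() returns true, and nothing ever goes back to try T2, so the * is left unconsumed. Trying the longer alternatives first happens to accept the string for this particular grammar, but it is not a general fix; the parser still cannot revisit a choice once a production has succeeded.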
When Recursive Descent Does Not Work
Limitations

• Consider a production S → S a

bool S1() { return S() && term(a); }
bool S() { return S1(); }

• S() goes into an infinite loop

A left-recursive grammar has a non-terminal S
S →+ S α for some α

Recursive descent does not work in such cases
Elimination of Left Recursion

Consider the left-recursive grammar

S → S α | β

S generates all strings starting with a β and followed by a number of α

Can rewrite using right-recursion

S → β S'

S' → α S' | ε
More Elimination of Left-Recursion

In general

S → S α1 | ... | S αn | β1 | ... | βm

All strings derived from S start with one of β1, ..., βm
and continue with several instances of α1, ..., αn

Rewrite as

S → β1 S' | ... | βm S'

S' → α1 S' | ... | αn S' | ε
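
The rewritten grammar can be parsed by recursive descent without looping. Here is a minimal sketch for the simplest case, S → β S' and S' → α S' | ε, where α is assumed to be the single character 'a', β the single character 'b', and the input is a NUL-terminated C string. The names S_prime and accepts are illustrative.

```c
#include <stdbool.h>

static const char *next;   /* points to the next input character */

/* Compare the next character with c and advance past it either way. */
static bool term(char c) { return *next++ == c; }

/* S' -> a S' | epsilon */
static bool S_prime(void) {
    const char *save = next;
    if ((next = save, term('a')) && S_prime())
        return true;
    next = save;           /* epsilon alternative: consume nothing */
    return true;
}

/* S -> b S'   (the recursion is now on the right, so S() never calls itself first) */
static bool S(void) { return term('b') && S_prime(); }

/* Accept exactly the strings b, ba, baa, ... */
bool accepts(const char *input) {
    next = input;
    return S() && *next == '\0';
}
```

Every recursive call to S_prime happens only after one 'a' has been consumed, so the recursion depth is bounded by the length of the input. The left-recursive version S → S a called S() again before consuming anything.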
General Left Recursion

The grammar

S → A α | δ
A → S β

is also left-recursive because

S →+ S β α

This left-recursion can also be eliminated

Report: due next lecture
Quiz

Choose the grammar that correctly eliminates left recursion from the given grammar:

E → E + T | T
T → id | (E)

○ E → E + id | E + (E) | id | (E)

○ E → TE'
  E' → + TE' | ε
  T → id | (E)

○ E → E' + T | T
  E' → id | (E)
  T → id | (E)

○ E → id + E | E + T | T
  T → id | (E)
Summary of Recursive Descent



Simple and general parsing strategy 
Left-recursion must be eliminated first 
... but that can be done automatically 

Unpopular because of backtracking 
Thought to be too inefficient 

In practice, backtracking is eliminated by 
restricting the grammar 
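
One common way to restrict the grammar is left factoring, so that a single token of lookahead selects the production. The sketch below assumes the factored grammar E → T E', E' → + E | ε, T → int T' | ( E ), T' → * T | ε, which describes the same language as the lecture's E → T | T + E, T → int | int * T | ( E ). The factored grammar, the END sentinel, and the name parse_predictive are assumptions for illustration, not taken from the slides.

```c
#include <stdbool.h>

typedef enum { INT, OPEN, CLOSE, PLUS, TIMES, END } TOKEN;  /* END: assumed sentinel */

static const TOKEN *next;   /* only ever advanced past a token that matched */

static bool E(void);

/* T -> int T' | ( E )      T' -> * T | epsilon */
static bool T(void) {
    switch (*next) {
    case INT:
        next++;
        if (*next == TIMES) {      /* T' -> * T */
            next++;
            return T();
        }
        return true;               /* T' -> epsilon */
    case OPEN:
        next++;
        if (!E()) return false;
        if (*next != CLOSE) return false;
        next++;
        return true;
    default:
        return false;              /* no production of T starts with this token */
    }
}

/* E -> T E'                E' -> + E | epsilon */
static bool E(void) {
    if (!T()) return false;
    if (*next == PLUS) {           /* E' -> + E */
        next++;
        return E();
    }
    return true;                   /* E' -> epsilon */
}

/* Parse a token array that ends with END; no position is ever saved or restored. */
bool parse_predictive(const TOKEN *tokens) {
    next = tokens;
    return E() && *next == END;
}
```

Because each decision looks only at the current token, the parser never backtracks, and inputs such as int * int are accepted regardless of the order in which the alternatives are written.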
Thanks



